IN THE CLAIMS 



1. (Currently Amended) An apparatus, in an integrated circuit (IC) of a data processing 
system having at least one host processor and host memory, comprising: 

a chip interconnect; 

a memory controller for controlling the host memory comprising DRAM memory, the 
memory controller coupled to the chip interconnect; 

a scalar processing unit coupled to the chip interconnect, the scalar processing unit 
being capable of executing instructions to perform scalar data processing; 

a vector processing unit coupled to the chip interconnect, the vector processing unit 
being capable of executing instructions to perform vector data processing; and 

an input and output (I/O) interface coupled to the chip interconnect, the I/O interface 
receiving/transmitting data from/to the scalar and/or vector processing units,
wherein the chip interconnect, the memory controller, the scalar processing
unit, the vector processing unit, and the I/O interface are implemented within
the IC which is a single chipset interfacing the at least one host processor and
the host memory with other components of the data processing system.

2. (Currently Amended) The apparatus of claim 1, further comprising a switch mechanism 
coupled to the chip interconnect and coupled to the scalar processing unit and coupled to the
vector processing unit, the switch mechanism operable to receive multiple media data
streams from the I/O interface and dispatch the multiple media data streams to the
scalar processing unit and/or the vector processing unit. 

3. (Currently Amended) The apparatus of claim 1, further comprising:

multiple scalar processing units, the multiple scalar processing units being capable of
executing instructions to perform scalar processing substantially 
simultaneously; and 

multiple vector processing units, the multiple vector processing units being capable of 
executing instructions to perform vector processing substantially 
simultaneously. 

4. (Original) The apparatus of claim 3, further comprising multiple scalar processing 
units of a kind and multiple vector processing units of a kind.

5. (Original) The apparatus of claim 1, further comprising: 

a general purpose register (GPR) file coupled to the scalar processing unit; 
a vector register (VR) file coupled to the vector processing unit; and 
a load and store unit (LSU), the LSU being capable of executing instructions to load 
and store scalar data from and to the GPR, and the LSU being capable of 
executing instructions to load and store vector data from and to the VR. 

6. (Original) The apparatus of claim 5, further comprising a memory location coupled to 
the chip interconnect, wherein the LSU loads and stores data from and to the memory 
location. 

7. (Original) The apparatus of claim 6, further comprising a direct memory access (DMA) 
engine, the DMA engine transferring the multiple media data between the memory location 
and the host memory.

8. (Original) The apparatus of claim 5, wherein the LSU is capable of executing
instructions to load and store various formats of scalar and vector data, wherein the various 
formats comprise 8-bit, 16-bit, and 32-bit formats. 

9. (Currently Amended) The apparatus of claim 1, wherein the switch mechanism 
comprises an instruction unit (IUNIT), the IUNIT controlling and dispatching instructions
substantially simultaneously. 

10. (Original) The apparatus of claim 9, wherein the instructions comprise very long 
instruction word (VLIW) instructions. 

11. (Original) The apparatus of claim 9, wherein the IUNIT further comprises:

a program counter;

a branch unit, wherein the program counter and the branch unit determine the location
to fetch next instructions;

an instruction cache memory, the instruction cache memory comprising instruction
cache tag and data memories for buffering instructions transmitted from the
host memory; and

at least one memory mapped register accessible by the host.

12. (Original) The apparatus of claim 1, wherein the scalar processing unit comprises: 

an integer arithmetic and logic unit (IALU), the IALU being capable of executing
instructions to perform simple scalar integer arithmetic and logical operations;
and

an integer shift unit (ISHU), the ISHU being capable of executing instructions to
perform scalar bit shifting and rotating operations.

13. (Original) The apparatus of claim 12, wherein the scalar processing unit further
comprises a floating point unit (FPU), the FPU being capable of executing instructions to 
perform high precision scalar data processing. 

14. (Original) The apparatus of claim 1, wherein the vector processing unit comprises: 

a vector permute unit (VPU), the VPU being capable of executing instructions to
perform vector permute operations;

a vector simple integer unit (VSIU), the VSIU being capable of executing instructions
to perform vector simple integer arithmetic and logical operations;

a vector complex integer unit (VCIU), the VCIU being capable of executing
instructions to perform vector complex integer arithmetic operations; and

a vector look-up table unit (VLUT), the VLUT being capable of executing instructions
to perform at least one vector table look-up.

15. (Original) The apparatus of claim 14, wherein the vector processing unit further 
comprises a vector floating point unit (VFPU), the VFPU being capable of executing 
instructions to perform high precision vector data processing. 

16. (Original) The apparatus of claim 14, wherein the VLUT comprises a memory location 
storing at least one look-up table (LUT). 

17. (Original) The apparatus of claim 16, wherein data of the LUT are transferred from the 
host memory to the memory location through a direct memory access (DMA) operation. 

18. (Original) The apparatus of claim 16, wherein the memory location comprises a static 
random access memory (SRAM).

19. (Original) The apparatus of claim 1, wherein the scalar and vector processing units are
capable of performing data processing autonomously and asynchronously to the host 
processor. 

20. (Original) The apparatus of claim 1, wherein the scalar and vector processing units 
communicate with the host processor through an interrupt mechanism.

21. (Original) The apparatus of claim 1, wherein the scalar and vector processing units are 
accessible by the host processor, through a set of memory mapped addresses. 

22. (Original) The apparatus of claim 1, wherein the IC may be a co-processor to the host, 
wherein the IC may be a stand-alone processor coupled to a bus of the data processing 
system, and wherein the chipset may be a core logic chip having a host interface coupled 
to the host processor and memory interface coupled to the host memory. 

23. (Original) The apparatus of claim 5, further comprising a special purpose register (SPR)
file coupled to the chip interconnect. 

24. (Currently Amended) A method, in an integrated circuit (IC) having a chip interconnect, 
of a data processing system having at least one host processor and a host memory, the method 
comprising: 

receiving a data stream from an input/output (I/O) interface coupled to the chip
interconnect;

examining data of the data stream to determine whether the data requires scalar
data processing or vector data processing;

performing scalar data processing on the data in the IC, if the data requires
scalar data processing; and

performing vector data processing on the data in the IC, if the data requires
vector data processing, wherein receiving the data stream, examining the data,
the scalar data processing, and the vector data processing are performed within
the IC which is a single chipset interfacing the at least one host processor and
the host memory with other components of the data processing system.

25. (Original) The method of claim 24, further comprising: 

a scalar processing unit coupled to the chip interconnect, the scalar processing unit
performing scalar data processing on the data; and

a vector processing unit coupled to the chip interconnect, the vector processing unit
performing vector data processing on the data.

26. (Currently Amended) The method of claim 25, further comprising multiple scalar 
processing units performing scalar data processing substantially simultaneously and multiple 
vector processing units performing vector data processing substantially simultaneously. 

27. (Currently Amended) The method of claim 25, further comprising: 

dispatching the data to the scalar processing unit if the data requires scalar data
processing; and

dispatching the data to the vector processing unit if the data requires vector
data processing.

28. (Original) The method of claim 27, wherein the dispatching is performed by a switch 
mechanism coupled to the chip interconnect, the switch mechanism receiving the data from 
the I/O interface.

29. (Original) The method of claim 28, wherein the switch mechanism comprises an
instruction unit (IUNIT), the IUNIT being capable of decoding the data.

30. (Original) The method of claim 24, further comprising: 

a general purpose register (GPR) file coupled to the scalar processing unit; 

a vector register (VR) file coupled to the vector processing unit; 

a special purpose register (SPR) coupled to the chip interconnect; and 

a load and store unit (LSU), the LSU being capable of executing instructions to load 
and store scalar data from and to the GPR, and the LSU being capable of 
executing instructions to load and store vector data from and to the VR. 

31. (Original) The method of claim 30, further comprising a memory location coupled to the 
chip interconnect, wherein the LSU loads and stores data from and to the memory location. 

32. (Original) The method of claim 31, further comprising transferring the data between the
memory location and the host memory, through a direct memory access (DMA) operation. 

33. (Original) The method of claim 24, wherein the scalar processing unit comprises: 

an integer arithmetic and logic unit (IALU), the IALU being capable of executing
instructions to perform simple scalar integer arithmetic and logical operations;

an integer shift unit (ISHU), the ISHU being capable of executing instructions to 
perform scalar bit shifting and rotating operations; and 

a floating point unit (FPU), the FPU being capable of executing instructions to 
perform high precision scalar data processing. 

34. (Original) The method of claim 24, wherein the vector processing unit comprises:

a vector permute unit (VPU), the VPU being capable of executing instructions to
perform vector permute operations;

a vector simple integer unit (VSIU), the VSIU being capable of executing instructions
to perform vector simple integer arithmetic and logical operations;

a vector complex integer unit (VCIU), the VCIU being capable of executing
instructions to perform vector complex integer arithmetic operations;

a vector look-up table unit (VLUT), the VLUT being capable of executing instructions
to perform at least one vector table look-up; and

a vector floating point unit (VFPU), the VFPU being capable of executing instructions
to perform high precision vector data processing.

35. (Original) The method of claim 34, wherein the VLUT comprises a memory location 
storing at least one look-up table (LUT). 

36. (Original) The method of claim 35, further comprising transferring data of the LUT from 
the host memory to the memory location, through a direct memory access (DMA) operation. 

37. (Original) The method of claim 24, wherein the scalar data processing and vector data 
processing are performed autonomously and asynchronously to the host processor. 

38. (Original) The method of claim 25, wherein the scalar processing unit and the vector 
processing unit communicate with the host processor through an interrupt mechanism. 

39. (Original) The method of claim 25, wherein the scalar processing unit and the vector 
processing unit are accessible by the host processor through a set of memory mapped
addresses.

40. (Currently Amended) An apparatus, in an integrated circuit (IC) having a chip
interconnect, of a data processing system having at least one host processor and a host 
memory, the apparatus comprising: 

means for receiving a data stream from an input/output (I/O) interface coupled to the
chip interconnect;

means for examining the data to determine whether the data requires scalar
data processing or vector data processing;

means for performing scalar data processing on the data in the IC, if the data
requires scalar data processing; and

means for performing vector data processing on the data in the IC, if the data
requires vector data processing, wherein receiving the data stream, examining
the data, the scalar data processing, and the vector data processing are
performed within the IC which is a single chipset interfacing the at least one
host processor and the host memory with other components of the data
processing system.



41. (Original) The apparatus of claim 40, further comprising: 

a scalar processing unit coupled to the chip interconnect, the scalar processing unit
performing scalar data processing on the data; and

a vector processing unit coupled to the chip interconnect, the vector processing unit
performing vector data processing on the data.

42. (Currently Amended) The apparatus of claim 41, further comprising multiple scalar
processing units performing scalar data processing substantially simultaneously and 
multiple vector processing units performing vector data processing substantially 
simultaneously.

43. (Currently Amended) The apparatus of claim 41, further comprising:

means for dispatching the data to the scalar processing unit if the data requires
scalar data processing; and

means for dispatching the data to the vector processing unit if the data requires
vector data processing.

44. (Original) The apparatus of claim 43, wherein the dispatching is performed by a switch 
mechanism coupled to the chip interconnect, the switch mechanism receiving the data 
from the I/O interface.

45. (Original) The apparatus of claim 44, wherein the switch mechanism comprises an 
instruction unit (IUNIT), the IUNIT being capable of decoding the data. 

46. (Original) The apparatus of claim 40, further comprising: 

a general purpose register (GPR) file coupled to the scalar processing unit; 

a vector register (VR) file coupled to the vector processing unit; 

a special purpose register (SPR) coupled to the chip interconnect; and 

a load and store unit (LSU), the LSU being capable of executing instructions to load 
and store scalar data from and to the GPR, and the LSU being capable of 
executing instructions to load and store vector data from and to the VR. 

47. (Original) The apparatus of claim 46, further comprising a memory location coupled to 
the chip interconnect, wherein the LSU loads and stores data from and to the memory 
location.

48. (Original) The apparatus of claim 47, further comprising means for transferring the data
between the memory location and the host memory, through a direct memory access (DMA) 
operation. 

49. (Original) The apparatus of claim 40, wherein the scalar processing unit comprises: 

an integer arithmetic and logic unit (IALU), the IALU being capable of executing
instructions to perform simple scalar integer arithmetic and logical operations;

an integer shift unit (ISHU), the ISHU being capable of executing instructions to 
perform scalar bit shifting and rotating operations; and 

a floating point unit (FPU), the FPU being capable of executing instructions to 
perform high precision scalar data processing. 

50. (Original) The apparatus of claim 40, wherein the vector processing unit comprises: 

a vector permute unit (VPU), the VPU being capable of executing instructions to
perform vector permute operations;

a vector simple integer unit (VSIU), the VSIU being capable of executing instructions
to perform vector simple integer arithmetic and logical operations;

a vector complex integer unit (VCIU), the VCIU being capable of executing
instructions to perform vector complex integer arithmetic operations;

a vector look-up table unit (VLUT), the VLUT being capable of executing instructions
to perform at least one vector table look-up; and

a vector floating point unit (VFPU), the VFPU being capable of executing instructions
to perform high precision vector data processing.

51. (Original) The apparatus of claim 50, wherein the VLUT comprises a memory location 
storing at least one look-up table (LUT).

52. (Original) The apparatus of claim 51, further comprising means for transferring data of
the LUT from the host memory to the memory location, through a direct memory access 
(DMA) operation. 

53. (Original) The apparatus of claim 40, wherein the scalar data processing and vector data 
processing are performed autonomously and asynchronously to the host processor. 

54. (Original) The apparatus of claim 41, wherein the scalar processing unit and the vector 
processing unit communicate with the host processor through an interrupt mechanism. 

55. (Original) The apparatus of claim 41, wherein the scalar processing unit and the vector 
processing unit are accessible by the host processor through a set of memory mapped
addresses. 

56. (Currently Amended) A machine readable medium having stored thereon executable 
code which causes a machine to perform a method, in an integrated circuit (IC) having a chip 
interconnect, of a data processing system having at least one host processor and a host 
memory, the method comprising: 

receiving a data stream from an input/output (I/O) interface coupled to the chip
interconnect;

examining the data to determine whether the data requires scalar data
processing or vector data processing;

performing scalar data processing on the data in the IC, if the data requires
scalar data processing; and

performing vector data processing on the data in the IC, if the data requires
vector data processing, wherein receiving the data stream, examining the data,
the scalar data processing, and the vector data processing are performed within
the IC which is a single chipset interfacing the at least one host processor and
the host memory with other components of the data processing system.



57. (Original) The machine readable medium of claim 56, wherein the method further 
comprises: 

a scalar processing unit coupled to the chip interconnect, the scalar processing unit
performing scalar data processing on the data; and

a vector processing unit coupled to the chip interconnect, the vector processing unit
performing vector data processing on the data.

58. (Currently Amended) The machine readable medium of claim 57, wherein the method 
further comprises multiple scalar processing units performing scalar data processing 
substantially simultaneously and multiple vector processing units performing vector data 
processing substantially simultaneously. 

59. (Currently Amended) The machine readable medium of claim 57, wherein the method 
further comprises: 

dispatching the data to the scalar processing unit if the data requires scalar data
processing; and

dispatching the data to the vector processing unit if the data requires vector
data processing.

60. (Original) The machine readable medium of claim 59, wherein the dispatching is 
performed by a switch mechanism coupled to the chip interconnect, the switch mechanism 
receiving the data from the I/O interface.

61. (Original) The machine readable medium of claim 60, wherein the switch mechanism
comprises an instruction unit (IUNIT), the IUNIT being capable of decoding the data.

62. (Original) The machine readable medium of claim 56, wherein the method further 
comprises: 

a general purpose register (GPR) file coupled to the scalar processing unit; 

a vector register (VR) file coupled to the vector processing unit; 

a special purpose register (SPR) coupled to the chip interconnect; and 

a load and store unit (LSU), the LSU being capable of executing instructions to load 
and store scalar data from and to the GPR, and the LSU being capable of 
executing instructions to load and store vector data from and to the VR. 

63. (Original) The machine readable medium of claim 62, wherein the method further 
comprises a memory location coupled to the chip interconnect, wherein the LSU loads and 
stores data from and to the memory location. 

64. (Original) The machine readable medium of claim 63, wherein the method further 
comprises transferring the data between the memory location and the host memory, through a 
direct memory access (DMA) operation. 

65. (Original) The machine readable medium of claim 56, wherein the scalar processing unit 
comprises: 

an integer arithmetic and logic unit (IALU), the IALU being capable of executing
instructions to perform simple scalar integer arithmetic and logical operations;

an integer shift unit (ISHU), the ISHU being capable of executing instructions to
perform scalar bit shifting and rotating operations; and

a floating point unit (FPU), the FPU being capable of executing instructions to
perform high precision scalar data processing.

66. (Original) The machine readable medium of claim 56, wherein the vector processing 
unit comprises: 

a vector permute unit (VPU), the VPU being capable of executing instructions to
perform vector permute operations;

a vector simple integer unit (VSIU), the VSIU being capable of executing instructions
to perform vector simple integer arithmetic and logical operations;

a vector complex integer unit (VCIU), the VCIU being capable of executing
instructions to perform vector complex integer arithmetic operations;

a vector look-up table unit (VLUT), the VLUT being capable of executing instructions
to perform at least one vector table look-up; and

a vector floating point unit (VFPU), the VFPU being capable of executing instructions
to perform high precision vector data processing.

67. (Original) The machine readable medium of claim 66, wherein the VLUT comprises a 
memory location storing at least one look-up table (LUT). 

68. (Original) The machine readable medium of claim 67, wherein the method further 
comprises transferring data of the LUT from the host memory to the memory location, 
through a direct memory access (DMA) operation. 

69. (Original) The machine readable medium of claim 56, wherein the scalar data processing 
and vector data processing are performed autonomously and asynchronously to the host 
processor.

70. (Original) The machine readable medium of claim 57, wherein the scalar processing unit
and the vector processing unit communicate with the host processor through an interrupt 
mechanism. 

71. (Original) The machine readable medium of claim 57, wherein the scalar processing unit 
and the vector processing unit are accessible by the host processor through a set of memory
mapped addresses. 